A Scheduling Algorithm for Running Bag-of-Tasks Data Mining Applications on the Grid
Abstract
Data mining applications are composed of compute-intensive processing tasks, which makes them natural candidates for execution on high-performance, high-throughput platforms such as PC clusters and computational grids. Moreover, some data mining algorithms can be implemented as Bag-of-Tasks (BoT) applications, which are composed of parallel, independent tasks. Because their tasks are independent, adapting BoT applications to the grid is straightforward. Accordingly, this work proposes a scheduling algorithm for running BoT data mining applications on grid platforms. The proposed algorithm is evaluated through several experiments, and the results show that it improves both the scalability and the performance of such applications.
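The abstract does not describe the proposed algorithm itself, so the following is only a minimal sketch of the general BoT setting it refers to: independent tasks handed out to workers of different speeds. It uses a generic self-scheduling work queue (each task goes to whichever worker becomes free first), which is a common baseline and not the paper's method. The function name `schedule_bag_of_tasks` and the parameters `task_costs` and `worker_speeds` are hypothetical names chosen for illustration.

```python
import heapq


def schedule_bag_of_tasks(task_costs, worker_speeds):
    """Simulate a generic work-queue scheduler for independent tasks.

    Assumption: this is an illustrative baseline, not the algorithm
    proposed in the paper. task_costs[i] is the amount of work in task i;
    worker_speeds[w] is the work that worker w completes per time unit.
    """
    # Heap entries are (time at which the worker becomes free, worker index).
    free_at = [(0.0, w) for w in range(len(worker_speeds))]
    heapq.heapify(free_at)
    assignment = {w: [] for w in range(len(worker_speeds))}

    for task_id, cost in enumerate(task_costs):
        # Tasks are independent, so any idle worker may take the next one.
        start, w = heapq.heappop(free_at)
        finish = start + cost / worker_speeds[w]
        assignment[w].append(task_id)
        heapq.heappush(free_at, (finish, w))

    # The makespan is the time at which the last worker finishes.
    makespan = max(t for t, _ in free_at)
    return assignment, makespan
```

Calling `schedule_bag_of_tasks([4, 4, 4, 4], [1.0, 2.0])` returns the per-worker task lists and the simulated makespan. Because no task depends on another, the scheduler needs no ordering constraints, which is what makes BoT applications easy to adapt to a grid.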
Similar resources
Improving scalability of Bag-of-Tasks applications running on master-slave platforms
Elsevier B.V., 2008. doi:10.1016/j.parco.2008.09.013. Bag-of-Tasks applications are parallel applications composed of independent tasks. Examples of Bag-of-Tasks (BoT) applications include Monte Carlo simulations, massi...
Green Energy-aware task scheduling using the DVFS technique in Cloud Computing
Nowadays, energy consumption has become a critical issue in high-performance distributed computing systems, so green computing tries to reduce energy consumption, carbon footprint, and CO2 emissions in high-performance computing (HPC) systems such as clusters, Grids, and Clouds, which run a large number of parallel applications. Reducing energy consumption for high-end computing can bring various benefits such as red...
An Efficient Genetic Algorithm for Task Scheduling on Heterogeneous Computing Systems Based on TRIZ
Efficient assignment and scheduling of tasks is one of the key elements in the effective utilization of heterogeneous multiprocessor systems. The task scheduling problem has been proven to be NP-hard, which is why we used meta-heuristic methods to find a suboptimal schedule. In this paper we propose a new approach using TRIZ (specifically, its 40 inventive principles). The basic idea of thi...
A New Job Scheduling in Data Grid Environment Based on Data and Computational Resource Availability
A Data Grid is an infrastructure that manages huge amounts of data files and provides intensive computational resources across geographically distributed collaborations. The heterogeneity and geographic dispersion of grid resources and applications pose complex problems such as job scheduling. Most existing scheduling algorithms in Grids focus on only one kind of Grid job, which can be data...